9 research outputs found

    Spin-Transfer-Torque (STT) Devices for On-chip Memory and Their Applications to Low-standby Power Systems

    Get PDF
    With the scaling of CMOS technology, leakage accounts for an increasing proportion of total power consumption; in high-performance processors, it may reach almost half of the total. To reduce leakage power, there is growing interest in using nonvolatile storage devices for memory applications. Among the various promising nonvolatile memory elements, spin-transfer torque magnetic RAM (STT-MRAM) is identified as one of the most attractive alternatives to conventional SRAM. However, several design issues of STT-MRAM, such as shared read and write current paths, single-ended sensing, and high dynamic power, must be overcome to make it suitable for on-chip memories. To mitigate these problems, we propose a domain wall coupling based spin-transfer torque (DWCSTT) device for on-chip caches. Our proposed DWCSTT bit-cell decouples the read and write current paths through an electrically insulating magnetic coupling layer, so that the read operation can be optimized separately without affecting write-ability. In addition, the complementary polarizer structure in the read path of the DWCSTT device enables self-referenced differential sensing. DWCSTT bit-cells also reduce write power consumption owing to the low electrical resistance of the write current path. Furthermore, we present three bit-cell-level design techniques for Spin-Orbit Torque MRAM (SOT-MRAM) that alleviate some of the inefficiencies of conventional magnetic memories while preserving the advantages of the novel spin-orbit torque (SOT) switching mechanism, such as its low write-current requirement and decoupled read and write current paths. First, our proposed SOT-MRAM supporting dual read/write ports (1R/1W) addresses the high write latency of STT-MRAM through simultaneous 1R/1W accesses. Second, we propose a new type of SOT-MRAM that uses only one access transistor along with a Schottky diode to mitigate the area overhead caused by the two access transistors in conventional SOT-MRAM. Finally, a new SOT-MRAM design technique is presented that improves integration density by utilizing a shared bit-line structure.
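
    The self-referenced differential sensing claim can be illustrated with a toy model. The Python sketch below is not from the dissertation; the resistance values and the split into two complementary branches are assumptions made purely for illustration, contrasting single-ended sensing against a fixed reference with a self-referenced comparison of two complementary branches of the same cell.

```python
# Minimal sketch (illustrative, not the dissertation's model): why a
# self-referenced differential read, as enabled by a complementary-polarizer
# read path, tolerates global resistance variation better than single-ended
# sensing. All resistance values below are hypothetical.

R_P, R_AP = 2e3, 4e3          # parallel / antiparallel resistance (ohms), assumed
R_REF = 3e3                   # fixed reference for single-ended sensing, assumed

def single_ended_read(r_cell: float) -> int:
    """Compare the cell branch against a fixed reference resistance."""
    return 1 if r_cell > R_REF else 0

def differential_read(r_branch_a: float, r_branch_b: float) -> int:
    """Compare the two complementary branches of the same cell (self-referenced)."""
    return 1 if r_branch_a > r_branch_b else 0

def read_bit(stored: int, global_shift: float):
    # A global process/temperature shift scales both branches equally.
    r_a = (R_AP if stored else R_P) * global_shift
    r_b = (R_P if stored else R_AP) * global_shift
    return single_ended_read(r_a), differential_read(r_a, r_b)

for shift in (1.0, 0.7, 1.5):          # nominal, -30%, +50% resistance shift
    se, diff = read_bit(stored=1, global_shift=shift)
    print(f"shift={shift:>4}: single-ended reads {se}, differential reads {diff}")
# The single-ended result flips once the shift pushes R_AP below R_REF,
# while the self-referenced comparison is unaffected.
```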

    Area Optimization Techniques for High-Density Spin-Orbit Torque MRAMs

    No full text
    This paper presents area optimization techniques for high-density spin-orbit torque magnetic random-access memories (SOT-MRAMs). Although SOT-MRAM offers many desirable features, such as nonvolatility, high reliability, and low write energy, it poses challenges for high-density memory implementation because of the use of two access transistors per cell. We first analyze the layout of the conventional SOT-MRAM bit-cell, in which two vertical metal lines, a bit-line and a source-line, limit the horizontal dimension. We then propose two design techniques that reduce the horizontal dimension by decreasing the number of metal lines per cell without any performance overhead. Based on the fact that adjacent columns in a bit-interleaved array are not simultaneously accessed, the proposed techniques share a single source-line between two consecutive bit-cells in the same row. Simulation results show that the proposed techniques achieve a bit-cell area reduction of 10–25% compared to the conventional SOT-MRAM. Compared with standard spin-transfer torque MRAM, the proposed designs show 45% lower write energy, 84% lower read energy, and a 2.3× higher read-disturb margin.
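
    As a rough illustration of why sharing a source-line shrinks the cell, the sketch below estimates the horizontal pitch from the number of vertical metal tracks per cell. The pitch and device-width numbers are hypothetical placeholders, not values from the paper.

```python
# Minimal sketch (hypothetical numbers, not the paper's layout model): estimate
# the horizontal bit-cell dimension when a source-line is shared between two
# adjacent cells in the same row, so each cell carries 1.5 metal lines on
# average instead of 2.

METAL_PITCH = 0.14      # um, minimum vertical-metal track pitch (assumed)
DEVICE_WIDTH = 0.22     # um, width needed by the two access transistors (assumed)

def cell_width(metal_lines_per_cell: float) -> float:
    """Horizontal cell dimension set by whichever is wider: devices or metal tracks."""
    return max(DEVICE_WIDTH, metal_lines_per_cell * METAL_PITCH)

conventional = cell_width(2.0)   # dedicated bit-line + source-line per cell
shared       = cell_width(1.5)   # one source-line amortized over two cells

print(f"conventional width: {conventional:.3f} um")
print(f"shared source-line: {shared:.3f} um")
print(f"width reduction   : {100 * (1 - shared / conventional):.1f} %")
```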

    High-Density 1R/1W Dual-Port Spin-Transfer Torque MRAM

    No full text
    Spin-transfer torque magnetic random-access memory (STT-MRAM) has several desirable features, such as non-volatility, high integration density, and near-zero leakage power. However, adopting STT-MRAM in a wide range of memory applications is challenging owing to its long write latency and the tradeoff between read stability and write ability. To mitigate these issues, an STT-MRAM bit cell can be designed with two transistors to support multiple ports as well as the independent optimization of read stability and write ability. Multi-port STT-MRAM, however, comes at the expense of a higher area requirement due to the additional transistor per cell. In this work, we propose an area-efficient 1R/1W dual-port STT-MRAM design that shares a bitline between two adjacent bit cells. We identify that bitline sharing may cause simultaneous access conflicts, which can be effectively alleviated by using a bit-interleaving architecture with a long interleaving distance and a sufficient number of word lines per memory bank. We report various metrics of the proposed design based on a bit-cell design in a 45 nm process. Compared to a standard single-port STT-MRAM, the proposed design shows 15% lower read power and a 19% higher read-disturb margin. Compared with prior work on 1R/1W dual-port STT-MRAM, the proposed design reduces area by 25%.
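
    The access-conflict argument can be sketched in a few lines of Python. The column mapping, interleaving distance, and the even/odd bitline-pairing rule below are assumptions chosen for illustration, not the paper's exact array organization; the sketch only shows how a bit-interleaved placement keeps a simultaneous read and write off a shared bitline except for neighboring words.

```python
# Minimal sketch (assumption-laden model, not the paper's architecture): check
# whether a simultaneous read and write in a 1R/1W array with bitlines shared
# between adjacent columns would collide, given a bit-interleaved mapping.

INTERLEAVE = 8          # interleaving distance (columns between bits of one word), assumed

def physical_column(word_in_row: int, bit_index: int) -> int:
    """Bit-interleaved placement: consecutive columns hold the same bit of different words."""
    return bit_index * INTERLEAVE + word_in_row

def shares_bitline(col_a: int, col_b: int) -> bool:
    """Adjacent even/odd column pairs (0-1, 2-3, ...) share one bitline."""
    return col_a // 2 == col_b // 2

def access_conflict(read_word: int, write_word: int, word_bits: int = 4) -> bool:
    """True if any bit of the read word and any bit of the write word share a bitline."""
    read_cols  = {physical_column(read_word,  b) for b in range(word_bits)}
    write_cols = {physical_column(write_word, b) for b in range(word_bits)}
    return any(shares_bitline(r, w) for r in read_cols for w in write_cols)

# Only the same word or its immediate neighbor lands on shared bitline pairs in
# this toy mapping; such cases are what the bank organization has to avoid.
print(access_conflict(0, 0))   # True: same columns
print(access_conflict(0, 1))   # True: adjacent columns share a bitline
print(access_conflict(0, 2))   # False: columns land on different bitline pairs
```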

    Domain Wall Coupling-Based STT-MRAM for On-Chip Cache Applications

    No full text
    IEEE Transactions on Electron Devices, vol. 62, no. 2, pp. 554–56. DOI: 10.1109/TED.2014.2377751

    High-Performance and Robust Binarized Neural Network Accelerator Based on Modified Content-Addressable Memory

    No full text
    The binarized neural network (BNN) is one of the most promising candidates for low-cost convolutional neural networks (CNNs) because of its significant reduction in memory and computational costs and its reasonable classification accuracy. Content-addressable memory (CAM) can perform binarized convolution operations efficiently, since the bitwise comparison in CAM matches well with the binarized multiply operation in a BNN. However, a significant design issue in CAM-based BNN accelerators is that operational reliability is severely degraded by process variations during match-line (ML) sensing operations. In this paper, we propose a novel ML sensing scheme that reduces the hardware error probability. Most errors occur when the difference between the number of matches on the evaluation ML and the reference ML is small; thus, the proposed hardware uses dual references to identify cases that are vulnerable to process variations. The proposed dual-reference sensing structure produces >49% fewer ML sensing errors than the conventional design, leading to a >1.0% accuracy improvement for Fashion MNIST image classification. In addition, owing to the parallel convolution operation of the CAM-based BNN accelerator, the proposed hardware achieves a >34% processing-time improvement compared with a digital logic implementation.
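
    A minimal software model of the dual-reference idea is sketched below, assuming an XNOR-popcount match count per match-line and a guard band around the nominal reference; the vector length, reference, and guard-band width are hypothetical, not the paper's circuit parameters.

```python
# Minimal sketch (hypothetical parameters, not the paper's circuit): model the
# dual-reference match-line (ML) sensing idea for CAM-based binarized convolution.
# The binarized multiply is an XNOR, and the ML effectively senses the match count.
import random

N = 64                 # bits per match-line (weights per CAM row), assumed
REF = N // 2           # nominal decision threshold on the match count
GUARD = 3              # half-width of the "vulnerable" band around REF, assumed

def match_count(weights, inputs):
    """XNOR-popcount: number of bit positions where weight and input agree."""
    return sum(1 for w, x in zip(weights, inputs) if w == x)

def dual_reference_sense(count):
    """Return (decision, vulnerable): counts near REF are flagged for extra care."""
    vulnerable = (REF - GUARD) < count < (REF + GUARD)
    return int(count >= REF), vulnerable

random.seed(0)
weights = [random.randint(0, 1) for _ in range(N)]
inputs  = [random.randint(0, 1) for _ in range(N)]

cnt = match_count(weights, inputs)
decision, vulnerable = dual_reference_sense(cnt)
print(f"match count = {cnt}/{N}, decision = {decision}, near-threshold = {vulnerable}")
# Counts far from REF are robust to ML sensing variation; counts inside the
# guard band are the cases the dual-reference scheme singles out.
```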

    Fast and Disturb-Free Nonvolatile Flip-Flop Using Complementary Polarizer MTJ

    No full text
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 4, pp. 1573–157. DOI: 10.1109/TVLSI.2016.2631981

    Spin-Hall Magnetic Random-Access Memory With Dual Read/Write Ports for On-Chip Caches

    No full text
    IEEE Magnetics Letters. DOI: 10.1109/LMAG.2015.2422260

    High Performance and Energy-Efficient On-Chip Cache Using Dual Port (1R/1W) Spin-Orbit Torque MRAM

    No full text
    IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 6, no. 3, pp. 293–30. DOI: 10.1109/JETCAS.2016.2547701